SpeechRecognize
SpeechRecognize[audio] recognizes speech in audio and returns it as a string.
SpeechRecognize[audio, level] returns a list of strings, one for each unit of text at the specified structural level.
SpeechRecognize[audio, level, prop] returns the property prop for each unit of text at the given level.
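
As a minimal sketch of the three calling forms, the example below applies each one to a sample recording. It assumes that ExampleData[{"Audio", "MaleVoice"}] is available as a speech sample, that "Word" is an accepted level name, and that "Interval" is an accepted property name. None of these names is stated in the usage lines, so check them against the reference page before relying on them.

```wl
(* Load a short speech sample; the example-data name is an assumption. *)
audio = ExampleData[{"Audio", "MaleVoice"}];

(* One-argument form: the recognized speech as a single string. *)
SpeechRecognize[audio]

(* Two-argument form: a list of strings at the given structural level. *)
(* "Word" is an assumed level name. *)
SpeechRecognize[audio, "Word"]

(* Three-argument form: a property for the text at the given level. *)
(* "Interval" is an assumed property name. *)
SpeechRecognize[audio, "Word", "Interval"]
```

The first evaluation may take longer than later ones if the recognition model has to be downloaded or initialized.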
Please visit the official Wolfram Language Reference for more details and examples on core symbols.